200 research outputs found

    Efficient simulation of view synchrony

    Get PDF
    This report presents an algorithm for efficiently simulating view synchrony, including failure-atomic total-order multicast in a discrete-time event simulator. In this report we show how a view synchrony implementation tailored to a simulated environment removes the need for third party middleware and detailed network simulation, thus reducing the complexity of a test environment. An additional advantage is that simulated view synchrony can generate all timing behaviours allowed by the model instead of just those exhibited by a particular view synchrony implementation

    Atomic commitment in transactional DHTs

    Get PDF
    We investigate the problem of atomic commit in transactional database systems built on top of Distributed Hash Tables. DHTs provide a decentralized way to store and look up data. To solve the atomic commit problem we propose to use an adaption of Paxos commit as a non-blocking algorithm. We exploit the symmetric replication technique existing in the DKS DHT to determine which nodes are necessary to execute the commit algorithm. By doing so we achieve a lower number of communication rounds and a reduction of meta-data in contrast to traditional Three-Phase-Commit protocols. We also show how the proposed solution can cope with dynamism due to churn in DHTs. Our solution works correctly relying only on an inaccurate failure detection of node failure which is necessary for systems running over the Internet

    Kompics: a message-passing component model for building distributed systems

    Get PDF
    The Kompics component model and programming framework was designedto simplify the development of increasingly complex distributed systems. Systems built with Kompics leverage multi-core machines out of the box and they can be dynamically reconfigured to support hot software upgrades. A simulation framework enables deterministic debugging and reproducible performance evaluation of unmodified Kompics distributed systems. We describe the component model and show how to program and compose event-based distributed systems. We present the architectural patterns and abstractions that Kompics facilitates and we highlight a case study of a complex distributed middleware that we have built with Kompics. We show how our approach enables systematic development and evaluation of large-scale and dynamic distributed systems

    Moving the shared memory closer to the processors: DDM

    Get PDF
    Multiprocessors with shared memory are considered more general and easier to program than message-passing machines. The scalability is, however, in favor of the latter. There are a number of proposals showing how the poor scalability of shared memory multiprocessors can be improved by the introduction of private caches attached to the processors. These caches are kept consistent with each other by cache-coherence protocols. In this paper we introduce a new class of architectures called Cache Only Memory Architectures (COMA). These architectures provide the programming paradigm of the shared-memory architectures, but are believed to be more scal- able. COMAs have no physically shared memory; instead, the caches attached to the processors contain all the memory in the system, and their size is therefore large. A datum is allowed to be in any or many of the caches, and will automatically be moved to where it is needed by a cache-coherence protocol, which also ensures that the last copy of a datum is never lost. The location of a datum in the machine is completely decoupled from its address. We also introduce one example of COMA: the Data Diffusion Machine (DDM). The DDM is based on a hierarchical network structure, with processor/memory pairs at its tips. Remote accesses generally cause only a limited amount of traffic over a limited part of the machine. The architecture is scalable in that there can be any number of levels in the hierarchy, and that the root bus of the hierarchy can be implemented by several buses, increasing the bandwidth

    CATS: linearizability and partition tolerance in scalable and self-organizing key-value stores

    Get PDF
    Distributed key-value stores provide scalable, fault-tolerant, and self-organizing storage services, but fall short of guaranteeing linearizable consistency in partially synchronous, lossy, partitionable, and dynamic networks, when data is distributed and replicated automatically by the principle of consistent hashing. This paper introduces consistent quorums as a solution for achieving atomic consistency. We present the design and implementation of CATS, a distributed key-value store which uses consistent quorums to guarantee linearizability and partition tolerance in such adverse and dynamic network conditions. CATS is scalable, elastic, and self-organizing; key properties for modern cloud storage middleware. Our system shows that consistency can be achieved with practical performance and modest throughput overhead (5%) for read-intensive workloads

    Symmetric Replication for Structured Peer-to-Peer Systems

    Get PDF
    Structured peer-to-peer systems rely on replication as a basic means to provide fault-tolerance in presence of high churn. Most select replicas using either multiple hash functions, successor-lists, or leaf-sets. We show that all three alternatives have limitations. We present and provide full algorithmic speci¯cation for a generic replication scheme called symmetric replication which only needs O(1) message for every join and leave operation to maintain any replication degree. The scheme is applicable to all existing structured peer-to-peer systems, and can be implemented on-top of any DHT. The scheme has been implemented in our DKS system, and is used to do load-balancing, end-to-end fault-tolerance, and to increase the security by using distributed voting. We outline an extension to the scheme, implemented in DKS, which adds routing proximity to reduce latencies. The scheme is particularly suitable for use with erasure codes, as it can be used to fetch a random subset of the replicas for decoding

    Physics-inspired Performace Evaluation of a Structured Peer-to-Peer Overlay Network

    Get PDF
    In the majority of structured peer-to-peer overlay networks a graph with a desirable topology is constructed. In most cases, the graph is maintained by a periodic activity performed by each node in the graph to preserve the desirable structure in face of the continuous change of the set of nodes. The interaction of the autonomous periodic activities of the nodes renders the performance analysis of such systems complex and simulation of scales of interest can be prohibitive. Physicists, however, are accustomed to dealing with scale by characterizing a system using intensive variables, i.e. variables that are size independent. The approach has proved its usefulness when applied to satisfiability theory. This work is the first attempt to apply it in the area of distributed systems. The contribution of this paper is two-fold. First, we describe a methodology to be used for analyzing the performance of large scale distributed systems. Second, we show how we applied the methodology to find an intensive variable that describe the characteristic behavior of the Chord overlay network, namely, the ratio of the magnitude of perturbation of the network (joins/failures) to the magnitude of periodic stabilization of the network

    Handling Network Partitions and Mergers in Structured Overlay Networks

    Get PDF
    Structured overlay networks form a major class of peer-to-peer systems, which are touted for their abilities to scale, tolerate failures, and self-manage. Any long-lived Internet-scale distributed system is destined to face network partitions. Although the problem of network partitions and mergers is highly related to fault-tolerance and self-management in large-scale systems, it has hardly been studied in the context of structured peer-to-peer systems. These systems have mainly been studied under churn (frequent joins/failures), which as a side effect solves the problem of network partitions, as it is similar to massive node failures. Yet, the crucial aspect of network mergers has been ignored. In fact, it has been claimed that ring-based structured overlay networks, which constitute the majority of the structured overlays, are intrinsically ill-suited for merging rings. In this paper, we present an algorithm for merging multiple similar ring-based overlays when the underlying network merges. We examine the solution in dynamic conditions, showing how our solution is resilient to churn during the merger, something widely believed to be difficult or impossible. We evaluate the algorithm for various scenarios and show that even when falsely detecting a merger, the algorithm quickly terminates and does not clutter the network with many messages. The algorithm is flexible as the tradeoff between message complexity and time complexity can be adjusted by a parameter

    Kernel Andorra Prolog and its computation model

    Get PDF
    The logic programming language framework Kernel Andorra Prolog is defined by a formal computation model. In Kernel Andorra Prolog, general combinations of concurrent reactive languages and nondeterministic transformational languages may be specified. The framework is based on constraints
    corecore